Schema Normalization for Improving Schema Matching

نویسندگان

  • Serena Sorrentino
  • Sonia Bergamaschi
  • Maciej Gawinecki
  • Laura Po
چکیده

Schema matching is the problem of finding relationships among concepts across heterogeneous data sources (heterogeneous in format and in structure). Starting from the “hidden meaning” associated to schema labels (i.e. class/attribute names) it is possible to discover relationships among the elements of different schemata. Lexical annotation (i.e. annotation w.r.t. a thesaurus/lexical resource) helps in associating a “meaning” to schema labels. However, performance of semi-automatic lexical annotation methods on realworld schemata suffers from the abundance of non-dictionary words such as compound nouns, abbreviations and acronyms. We address this problem by proposing a method to perform schema label normalization which increases the number of comparable labels. The method semi-automatically expands abbreviations/acronyms and annotates compound nouns, with a minimal manual effort. We empirically prove that our normalization method helps in the identification of similarities among schema elements of different data sources, thus improving schema matching results.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Schema label normalization for improving schema matching

Schema matching is the problem of finding relationships among concepts across heterogeneous data sources that are heterogeneous in format and in structure. Starting from the “hidden meaning” associated with schema labels (i.e. class/attribute names) it is possible to discover relationships among the elements of different schemata. Lexical annotation (i.e. annotation w.r.t. a thesaurus/lexical r...

متن کامل

An Improved Semantic Schema Matching Approach

Schema matching is a critical step in many applications, such as data warehouse loading, Online Analytical Process (OLAP), Data mining, semantic web [2] and schema integration. This task is defined for finding the semantic correspondences between elements of two schemas. Recently, schema matching has found considerable interest in both research and practice. In this paper, we present a new impr...

متن کامل

Automatic generation of probabilistic relationships for improving schema matching

Schema matching is the problem of finding relationships among concepts across data sources that are heterogeneous in format and in structure. Starting from the ‘‘hidden meaning’’ associated with schema labels (i.e. class/attribute names), it is possible to discover lexical relationships among the elements of different schemata. In this work, we propose an automatic method aimed at discovering p...

متن کامل

A Survey of Schema Matching Research using Database Schemas and Instances

Schema matching is considered as one of the essential phases of data integration in database systems. The main aim of the schema matching process is to identify the correlation between schema which helps later in the data integration process. The main issue concern of schema matching is how to support the merging decision by providing the correspondence between attributes through syntactic and ...

متن کامل

Improving Real World Schema Matching with Decomposition Process

This paper tends to provide an answer to a difficult problem: Matching large XML schemas. Scalable Matching acquires a long execution time other than decreasing the quality of matches. In this paper, we propose an XML schema decomposition approach as a solution for large schema matching problem. The presented approach identifies the common structures between and within XML schemas, and decompos...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2009